A process for the production of transgenic plants with increased nutritional value via the expression of modified 2S storage albumins in said plants.

The invention pertains to a process for producing transgenic plants with increased nutritional value. It comprises:

- cultivating plants obtained from regenerated plant cells, or from seeds of plants obtained from said regenerated plant cells over one or several generations, whose genetic patrimony, replicable within said plants, comprises a precursor-coding nucleic acid sequence encoding the precursor of a 2S albumin storage protein and placed under the control of a promoter capable of directing gene expression in plants, said precursor-coding nucleic acid being modified in a nonessential region of its relevant sequence, which encodes the mature 2S albumin or a subunit thereof, with a nucleic acid insert in appropriate reading frame relationship with the surrounding part of said relevant sequence, said insert including a determined segment encoding a heterologous determined polypeptide containing appropriate aminoacids such as lysine and/or methionine and/or threonine and/or phenylalanine and/or tryptophan and/or leucine and/or valine and/or isoleucine.
Description

A process for the production of transgenic plants with increased nutritional value via the expression of modified 2S storage albumins

This invention relates to a process for the production of plants with an increased content of appropriate aminoacids having high nutritional properties, through the modification of plant genes encoding plant storage proteins, more particularly the 2S albumins.

More particularly, the invention aims at providing genetically modified plant DNA, and live plant material including said genetically modified DNA replicable with the cells of said plant material, which genetically modified plant DNA contains sequences encoding a polypeptide containing said appropriate aminoacids, the expression of which is under the control of a suitable plant promoter.

A further object of the invention is to take advantage of the capacity of 2S albumins to be produced in large 
amounts in plants. 

A further object of the invention is to take advantage of a hypervariable region of the 2S albumins: supplementing said hypervariable region of the gene encoding said 2S albumins with a number of codons for said appropriate aminoacids does not disturb the correct expression, processing and transport of the modified storage proteins produced into the protein bodies of the plants.

Animals and humans obtain their essential aminoacids, directly or indirectly, by eating plants. These essential aminoacids include lysine, tryptophan, threonine, methionine, phenylalanine, leucine, valine and isoleucine. For ease of language these aminoacids are called "appropriate aminoacids". Rather recently, agricultural scientists concerned with the world hunger problem concentrated their work on developing plants with high nutritional yield. These new varieties, obtained through breeding in most cases, were richer in carbohydrates but usually poorer in essential proteins than the wild type varieties from which they were derived. Currently, increasing recognition of the role of plants in supplying essential aminoacids to the animal world has led to emphasis on the development of new food plants having a better aminoacid content. Classical breeding, however, has limitations for achieving this goal. Molecular genetics, on the contrary, offers a possibility to overcome these difficulties. Reference is made to the European patent application 80208418 and the communication of Brown et al., 1986, in which a gene encoding a corn seed storage protein (one of the so-called zeins) is modified by the addition of sequences containing lysine codons.
Seed storage proteins represent up to 90% of total seed protein in the seeds of many plants. They are used as a source of nutrition for young seedlings in the period immediately after germination. The genes encoding them are strictly regulated, being expressed in a highly tissue-specific and stage-specific fashion (Walling et al., 1986; Higgins, 1984). Thus they are expressed almost exclusively in the developing seed, and different classes of seed storage proteins may be expressed at different stages in the development of the seed. They are generally restricted in their intracellular location, being stored in membrane-bound organelles called protein bodies or protein storage vacuoles. These organelles provide a protease-free environment, and often also contain protease inhibitors. A related group of proteins, the vegetative storage proteins, have similar aminoacid compositions and are also stored in specialized vacuoles, but are found in leaves instead of in seeds (Staswick, 1988). These proteins are degraded upon flowering, and are thought to serve as a nutritive source for developing seeds.

The expression of foreign genes in plants is well established (De Blaere et al., 1987). In several cases seed storage protein genes have been transferred to other plants. In most of these cases it was shown that within its new environment the transferred seed storage protein gene is expressed in a tissue-specific and developmentally regulated manner (Beachy et al., 1985; Sengupta-Gopalan et al., 1985; Marris et al., 1988; Ellis et al., 1988; Higgins et al., 1986; Okamuro et al., 1986). It has also been shown in at least two cases that foreign seed storage proteins are located in the protein bodies of the host plant (Greenwood and Chrispeels, 1985; Hoffman et al., 1987). It has further been shown that stable and functional messenger RNAs can be obtained if a cDNA, rather than a complete gene including introns, is used as the basis for the chimeric gene (Chee et al., 1986).

Storage proteins are generally classified on the basis of solubility and size (more specifically sedimentation rate, for instance as defined by Svedberg (in Stryer, Biochemistry, 2nd ed., W.H. Freeman, New York, page 599)). A particular class of seed storage proteins has been studied, the 2S seed storage proteins, which are water-soluble albumins. They represent a significant proportion of the seed storage proteins in many plants (Youle and Huang, 1981) (Table I) and their small size and consequently simpler structure make them an attractive target for modification (see also patent application EP 87 402 348.4). Several 2S storage proteins have been characterized at either the protein, cDNA or genomic clone levels (Crouch et al., 1983; Sharief and Li, 1982; Ampe et al., 1986; Altenbach et al., 1987; Ericson et al., 1986; De Castro et al., 1987; Scofield and Crouch, 1987; Josefsson et al., 1987; EP 87 402 348.4; Krebbers et al., 1988). 2S albumins are formed in the cell from two subunits of 6-9 and 3-4 kilodaltons (kd) respectively, which are linked by disulfide bridges.

The work in the references above showed that 2S albumins are synthesized as complex prepropeptides whose organization is shared between the 2S albumins of many different species and is shown diagrammatically for three of these species in figure 1. Several complete sequences are shown in figure 2.

As to Fig. 2, relating to protein sequences of 2S albumins, the following observations are made. For B. napus, B. excelsia, and A. thaliana both the protein and DNA sequences have been determined; for R. communis only the protein sequence is available (B. napus from Crouch et al., 1983 and Ericson et al., 1986; B. excelsia from Ampe et al., 1986, De Castro et al., 1987 and Altenbach et al., 1987; R. communis from Sharief and Li, 1982). Boxes indicate homologies, and raised dots the position of the cysteines.

Comparison of the protein sequences at the beginning of the precursor with standard consensus sequences for signal peptides reveals that the precursor has not one but two segments at the amino terminus which are not present in the mature protein, the first of which is a signal sequence (Perlman and Halvorson, 1983) and the second of which has been designated the amino terminal processed fragment (the so-called ATPF). Signal sequences serve to ensure the cotranslational transport of the nascent polypeptide across the membrane of the endoplasmic reticulum (Blobel, 1980), and are found in many types of proteins, including all seed storage proteins examined to date (Herman et al., 1986). This is crucial for the appropriate compartmentalization of the protein. The protein is further folded in such a way that correct disulfide bridges are formed. This process is probably localized at the luminal side of the endoplasmic reticulum membrane, where the enzyme disulfide isomerase is localized (Roden et al., 1982; Bergman and Kuehl, 1979). After translocation across the endoplasmic reticulum membrane it is thought that most storage proteins are transported via said endoplasmic reticulum to the Golgi bodies, and from the latter in small membrane-bound vesicles ("dense vesicles") to the protein bodies (Chrispeels, 1983; Craig and Goodchild, 1984; Lord, 1985). That the signal peptide is removed cotranslationally implies that the signals directing the further transport of seed storage proteins to the protein bodies must reside in the remainder of the protein sequence. Zeins and perhaps some other prolamins deviate from this pathway; indeed their protein bodies are formed by budding directly off the endoplasmic reticulum (Larkins and Hurkman, 1978). As already noted, 2S albumins contain sequences at the amino end of the precursor, other than the signal sequence, which are not present in the mature polypeptide. This is not general to all storage proteins. This amino terminal processed fragment is labeled ATPF in figure 1.

In addition, as shown in figure 1, several aminoacids located between the small and large subunits in the 
precursor are removed (labeled IPF in the figure, which stands for internal processed fragment). Furthermore,
several residues are removed from the carboxyl end of the precursor (labeled CTPF in the figure which stands 
for carboxyl terminal processed fragment). The cellular location of these latter processing steps is uncertain, 
but is most likely the protein bodies (Chrispeels et al., 1983; Lord, 1985). As a result of these processing steps 
the small subunit and the large subunit remain. These are linked by disulfide bridges, as discussed below. 

When the protein sequences of 2S albumins of different plants are compared, strong structural similarities are observed. This is more particularly illustrated by figure 2, which provides the aminoacid sequences of the small subunit and large subunit respectively of representative 2S seed storage albumins of different plants, i.e.:

R. comm.: Ricinus communis

A. thali.: Arabidopsis thaliana

B. napus: Brassica napus

B. excel.: Bertholletia excelsia (Brazil nut)

It must be noted that in Fig. 2: 

- the aminoacid sequences of said subunits extend on several lines; the cysteine groups of the aminoacid 
sequences of the exemplified storage proteins and identical aminoacids in several of said proteins have been
brought into vertical alignment; the hyphen signs which appear in some of these sequences represent absent 
aminoacids, in other words direct linkages between the closest aminoacids which surrounded them; 

- the aminoacid sequences which in the different proteins are conserved are framed. 

It will be observed that all the sequences contain eight cysteine residues (the first and second in the small 
subunit, the remainder in the large subunit) which could participate in disulfide bridges as diagrammatically
shown in Fig. 3, which represents a hypothetical model (for the purpose of the present discussion) rather than 
a representation of the true structure of the 2S albumin of Arabidopsis thaliana . 

Said hypothetical model has been inspired by the disulfide bridge mediated loop-formation of animal 
albumins, such as serum albumins (Brown, 1976), alpha-fetoprotein (Jagodzinski et al., 1987; Morinaga et al., 
1983) and the vitamin D binding protein, where analogous constant C-C doublets and C-X-C triplets were
observed (Yang et al., 1985). 

As can be seen in Fig. 2, the regions which are intercalated between the first and second cysteines, between the fifth and sixth cysteines, and between the seventh and eighth cysteines of the mature protein show a substantial degree of conservation or similarity. It would thus seem that these regions are in some way essential for the proper folding and/or stability of the protein when synthesized in the plants. An exception to this conservation consists in the distance between the sixth and seventh cysteine residues. This suggests that these arrangements are structurally important, but that some variation is permissible in the large subunit between said sixth and seventh cysteines, where little conservation of aminoacids is observed. An analogous suggestion has been made by Slightom and Chee (1987), where the vicilin-type seed storage proteins from peas were compared. These authors indeed suggest that aminoacid replacement mutations designed to increase the number of sulphur-containing aminoacids should be placed in regions which show little or no conservation of aminoacid sequences. The authors however conclude that whether such modifications can be tolerated will need to be tested in the seeds of transgenic plants. Moreover, the teaching provided in their paper on the properties of the storage protein modified through deletion concerns only the influence on expression levels and not on processing of said storage proteins.
An embodiment of this invention is the demonstration that a well chosen region of the 2S albumin allows variation without altering the properties and correct processing of said modified storage protein in plant cells.

This region (diagrammatically shown in Fig. 3 by an enlarged hatched portion) will in the examples hereafter be termed the "hypervariable region". Fig. 3 also shows the respective positions of the other parts of the precursor sequence, including the "IPF" section separating the small subunit and large subunit of the precursor, as well as the number of aminoacids (aa) in substantially conserved portions of the protein subunits between cysteine residues. The processing cleavage sites (as determined by Krebbers et al., 1988) are shown.

The seeds of many plants contain albumins of approximately the same size as the storage proteins discussed above. However, for ease of language, this document will use the term "2S albumins" to refer to seed proteins whose genes encode a peptide precursor with the general organization shown in figure 1 and which are processed to a final form consisting of two subunits linked by disulfide bridges. The process of the invention for producing plants with an increased content of appropriate aminoacids comprises:
cultivating plants obtained from regenerated plant cells or from seeds of plants obtained from said regenerated plant cells over one or several generations, wherein the genetic patrimony or information of said plant cells, replicable within said plants, includes a nucleic acid sequence, placed under the control of a plant promoter, which can be transcribed into the mRNA encoding at least part of the precursor of a 2S albumin including the signal peptide of said plant, said nucleic acid being hereafter referred to as the "precursor encoding nucleic acid",

. wherein said nucleic acid contains a nucleotide sequence (hereafter termed the "relevant sequence") which relevant sequence comprises a nonessential region modified by a heterologous nucleic acid insert forming an open reading frame in reading phase with the non modified parts surrounding said insert in said relevant sequence,

. wherein said insert includes a nucleotide segment encoding a polypeptide containing appropriate aminoacids.
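
The reading-phase requirement stated above can be illustrated with a short sketch. It is not part of the original disclosure: the function names, the toy sequences and the choice of Python are assumptions made purely for illustration. The sketch refuses an insert which is placed off a codon boundary, whose length is not a multiple of three, or which carries an in-frame stop codon, since any of these would prevent the insert from forming an open reading frame with the surrounding relevant sequence.

```python
# Illustrative sketch only; names and sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive triplets, dropping any trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def insert_in_frame(relevant_seq, insert, position):
    """Return the hybrid coding sequence with `insert` placed at `position`.

    `position` is a 0-based nucleotide offset counted from the first base of
    the reading frame. A ValueError is raised if the insertion would shift the
    reading frame or introduce a stop codon.
    """
    if position % 3 != 0:
        raise ValueError("insertion point is not on a codon boundary")
    if len(insert) % 3 != 0:
        raise ValueError("insert length is not a multiple of three")
    if any(c in STOP_CODONS for c in codons(insert)):
        raise ValueError("insert contains an in-frame stop codon")
    relevant_seq = relevant_seq.upper()
    return relevant_seq[:position] + insert.upper() + relevant_seq[position:]


# Toy example: a Met-Lys-Met insert placed after the second codon.
hybrid = insert_in_frame("TGCCAAGGATGC", "ATGAAGATG", 6)
print(hybrid)  # TGCCAAATGAAGATGGGATGC
```

Such a check only addresses the reading frame; it says nothing about folding, processing or targeting of the hybrid protein, which the description treats separately.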

It will be appreciated that under the above mentioned conditions each and every cell of the cultivated plant will include the modified nucleic acid. Yet the above defined recombinant or hybrid sequence will be expressed at high levels either constitutively, or only or mostly in certain organs of the cultivated plants, depending on which plant promoter has been chosen to direct its expression. In the case of seed-specific promoters the hybrid storage protein will be produced mostly in the seeds.

It will be understood that the "heterologous nucleic acid insert" defined above consists of an insert which contains nucleotide sequences which, at least in part, may be foreign to the natural nucleic acid encoding the precursor of the 2S albumins of the plant cells concerned, and which encode the appropriate aminoacids. Most generally the segment encoding the polypeptide containing said appropriate aminoacids will itself be foreign to the natural nucleic acid encoding the precursor of said storage protein. Nonetheless, the term "heterologous nucleic acid insert" does also extend to an insert containing a segment as above-defined normally present in the genetic patrimony or information of said plant cells, the "heterologous" character of said insert then referring to the different genetic environment which surrounds said insert.

In the preceding definition of the process according to the invention, the so-called "nonessential region" of the relevant sequence of said nucleic acid encoding the precursor consists of a region whose nucleotide sequence can be modified either by insertion into it of the above defined insert or by replacement of at least part of said nonessential region by said insert, yet without disturbing the stability and correct processing of said hybrid storage protein as well as its transport into the above-said protein bodies. Sequences consisting of said insert or replacement and representing the coding region for a polypeptide containing appropriate aminoacids can either be put in as synthetic oligomers or as restriction fragments isolated from other genes, as taught by Brown et al., 1986. The total length of the hybrid storage protein may be longer or shorter than the total length of the non-modified 2S albumin.
With respect to the choice of the region to be modified, the present invention is clearly distinguishable from other work which has been done in this field. Reference is made to patent DD-A-240911 from the Akademie der Wissenschaften der DDR, where legumin genes from Vicia faba (glutine and prolamine) were modified in vitro with sequences encoding methionine. A naturally occurring PstI site was chosen as the place of insertion. At the EMBO workshop "Plant storage protein genes" (Breisach, FRG, September 1986) the authors presented their work and informed the audience that plant transformation experiments had just been started with the modified gene. No further results have yet been published.

Reference is also made to patent application WO-A-87/07299 and the corresponding publication of Radke et al., 1988. These documents describe the modification of the napin gene, which encodes the 2S albumin of Brassica napus, by a nucleotide sequence encoding nine aminoacid residues including 5 consecutive methionines. The region of modification is a naturally occurring SstI site within the region encoding the mature protein. Such a modification would result in an insertion directly adjacent to a cysteine residue and, moreover, in a region between two cysteines, namely the 4th and the 5th cysteines of the mature protein, which correspond to the 2nd and 3rd cysteines of the large subunit, whose length is strongly conserved (see above). We believe such a modification is likely to disrupt the normal folding and stability of the 2S albumin (see also EP 87 402 348.4). Moreover, the above cited references provide no evidence that the desired modified 2S albumin was successfully synthesized, correctly processed or correctly targeted.

In the present invention the precursor-coding nucleic acid referred to above may of course originate from the same plant species as that which is cultivated for the purpose of the invention. It may however originate from another plant species, in line with the teachings of Beachy et al., 1985 and Okamuro et al., 1986 already cited.

In a similar manner the plant promoter may originate from the same plant species or from another, subject in the latter case to the capability of the host plant's polymerases to recognize it. It may act constitutively or in a tissue-specific manner, such as, but not limited to, seed-specific promoters.

Regions such as the ones at the end of the small subunit, or at the beginning or end of the large subunit, show differences of such a magnitude that they can be held as presumably having no substantial impact on the final properties of the protein. The extreme carboxyl terminus of the small subunit and the amino terminus of the large subunit may, however, be involved in the processing of the internal processed fragment. A region which does not seem essential consists of the middle portion of the region located in the large subunit, between the sixth and the seventh cysteine of the mature protein, but not immediately adjacent to said cysteines and separated from them by at least 3 aminoacids. Thus in addition to the absence of similarity at the level of the aminoacid residues, there appears a difference in length which makes that region eligible for substitutions in the longest 2S albumins and for addition of aminoacids in the shortest 2S albumins, or for elongation of both. The same should be applicable at approximately the end of the first third of the same region between said sixth and seventh cysteines; see the sequence of R. communis, which is much shorter in that region than the corresponding regions of the other exemplified 2S proteins.
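
By way of illustration only, the positional rule just described (between the sixth and seventh cysteines, and at least 3 aminoacids away from each of them) can be expressed as a small helper. The function name, the default margin parameter and the toy sequence below are assumptions introduced for this sketch; the sequence is not one of the 2S albumins of Fig. 2, and the input is assumed to be the mature protein written as the small subunit followed by the large subunit, so that the cysteines are numbered as in the text.

```python
# Illustrative sketch only; the sequence used below is a made-up toy example.

def insertion_window(mature_protein, margin=3):
    """Locate the candidate nonessential window in a mature 2S albumin sequence.

    Returns a (start, end) pair of 0-based, half-open residue indices covering
    the residues between the sixth and seventh cysteines that are separated
    from each of these cysteines by at least `margin` residues, or None when
    the region is too short to leave such a window.
    """
    cys = [i for i, aa in enumerate(mature_protein.upper()) if aa == "C"]
    if len(cys) < 7:
        raise ValueError("fewer than seven cysteines found")
    start = cys[5] + 1 + margin  # first residue far enough from the sixth cysteine
    end = cys[6] - margin        # one past the last residue far enough from the seventh
    return (start, end) if start < end else None


# Toy sequence with eight cysteines and ten residues between the sixth and seventh.
toy = "ACAACAACCAAACAC" + "A" * 10 + "CAC"
print(insertion_window(toy))  # (18, 22)
```

A window found in this way is only a starting point; as the following paragraph cautions, alignments of this kind remain hypotheses that need experimental confirmation.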

It is of course realized that caution must be exercised against hypotheses based on arbitrary choices as concerns the alignment of similar parts of proteins which elsewhere exhibit substantial differences. Nevertheless such comparisons have proven, in other domains of genetics, to provide the person skilled in the art with appropriate guidance to reasonably infer from local structural differences, on the one hand, and from local similarities, on the other hand, in similar proteins of different sources, which parts of such proteins can be modified and which parts cannot, when it is sought to preserve some basic properties of the non-modified protein in the same protein locally modified by a foreign or heterologous sequence.

The choice of the adequate nonessential regions to be used in the process of the invention will also depend 
on the length of the polypeptide containing the appropriate aminoacids. Basically the method of the invention 
allows the modification of said 2S albumins by the insertion and/or partial substitution into the precursor
nucleic acid of sequences encoding up to 100 aminoacids. 

When the complete protein sequence of the region to be inserted into a 2S albumin has been determined, the nucleotide sequence encoding said protein sequence must be determined. It will be recognized that, while perhaps not absolutely necessary, the codon usage of the encoding nucleic acid should where possible be similar to that of the gene being modified.
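
As a hedged illustration of how codon usage might be matched, the sketch below counts the codons of the gene being modified and back-translates a peptide using, for each aminoacid, the codon seen most often in that gene. This is one simple strategy among several and is not taken from the original disclosure; the function names and the toy sequences are assumptions, and the codon table is the standard genetic code.

```python
# Illustrative sketch only; function names and sequences are hypothetical.
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(host_cds):
    """Map each aminoacid to the codon used most often in `host_cds`.

    Ties, including aminoacids absent from the host sequence, resolve to the
    first codon in table order.
    """
    host_cds = host_cds.upper()
    usable = len(host_cds) - len(host_cds) % 3
    counts = Counter(host_cds[i:i + 3] for i in range(0, usable, 3))
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(peptide, host_cds):
    """Encode `peptide` (one-letter code) with the host gene's preferred codons."""
    best = preferred_codons(host_cds)
    return "".join(best[aa] for aa in peptide.upper())


# Toy host coding sequence: Met-Lys-Lys-Leu-Trp.
print(back_translate("MKLW", "ATGAAGAAGCTGTGG"))  # ATGAAGCTGTGG
```

Always choosing the single most frequent codon is a deliberate simplification; sampling codons in proportion to their observed frequency would follow the host usage more closely.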

The person skilled in the art will have access to appropriate computer analysis tools to determine said codon usage. Any appropriate genetic engineering technique may be used for substituting the insert for part of the selected precursor-coding nucleic acid or for inserting it in the appropriate region of said precursor-coding nucleic acid. The general in vitro recombination techniques followed by cloning in bacteria can be used for making the chimeric genes. Site-directed mutagenesis can be used for the same purposes, as further exemplified hereafter. DNA recombinants, e.g. plasmids suitable for the transformation of plant cells, can also be produced according to techniques disclosed in current technical literature. The same applies finally to the production of transformed plant cells in which the hybrid storage protein encoded by the relevant parts of the selected precursor-coding nucleic acid can be expressed. By way of example, reference can be made to the published European application no. 116 718 or to International application WO 84/02913, which disclose appropriate techniques to that effect.

When designing the sequences rich in appropriate aminoacids, care must be taken that the resulting peptide containing said appropriate aminoacids does not influence the stability of the modified 2S albumin. Certain insertions may indeed disrupt the structure of the protein. For example, long stretches of methionines may result in rod-shaped helices which would result in instabilities due to disruption of normal folding patterns. Thus such sequences must occasionally include aminoacids which interrupt the helical structure.
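
A designed peptide can be screened for such uninterrupted stretches with a few lines of code. The sketch below is an illustration only: the function names are hypothetical, and the threshold of four consecutive residues is an arbitrary assumption, since the text gives no figure for what counts as a "long" stretch.

```python
# Illustrative sketch only; the max_run threshold is an arbitrary assumption.
from itertools import groupby


def longest_run(peptide, residue="M"):
    """Length of the longest uninterrupted stretch of `residue` in `peptide`."""
    runs = [len(list(group)) for aa, group in groupby(peptide.upper()) if aa == residue]
    return max(runs, default=0)


def needs_helix_breaker(peptide, residue="M", max_run=4):
    """True when a stretch of `residue` exceeds the chosen `max_run` threshold."""
    return longest_run(peptide, residue) > max_run


print(longest_run("MMMMMK"))          # 5
print(needs_helix_breaker("MMMMMK"))  # True
print(needs_helix_breaker("MMPMMK"))  # False
```

In the last example a proline, an aminoacid commonly regarded as helix-interrupting, splits the methionine stretch into two shorter runs; which interrupting residue to use and how often is left to the designer.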

The procedures which have been disclosed hereabove apply to the adequate modification of the nonessential region of any of the 2S albumins by a heterologous insert containing a DNA sequence encoding a peptide containing appropriate aminoacids with nutritional properties, and then to the transformation of the relevant plants with the chimeric gene obtained, for the production of a hybrid protein containing the sequence of said peptide in the cells of the relevant plant. Needless to say, the person skilled in the art will in all instances be able to select which of the existing techniques would best fulfill his needs at the level of each step of the production of such modified plants, to achieve the best production yields of said hybrid storage protein.

For instance the following process can be used in order to exploit the capacity of a 2S albumin to be used as a suitable vector for the production of plants with increased nutritional value, by inserting in said 2S albumins nucleotide codons encoding methionine and/or lysine and/or tryptophan and/or threonine and/or phenylalanine and/or leucine and/or valine and/or isoleucine when the corresponding precursor-coding nucleic acid has been sequenced. Such process then comprises:

1) locating and selecting one of said relevant sequences of the precursor-coding nucleic acid which comprises a nonessential region encoding a peptide sequence which can be modified by substituting an insert for part of it or by inserting said insert into it, which modification is compatible with the conservation of the configuration of said 2S albumins, and this preferably by determining the relative positions of the codons which encode the successive cysteine residues in the mature protein or protein subunits of said 2S albumins and identifying the corresponding successive nucleic acid regions located upstream of, between, and downstream of said codons within said sub-sequences of the precursor-coding nucleic acid and identifying in said successive regions those parts which undergo variability in either aminoacid sequence or length or both from one plant species to another as compared with those other regions which do exhibit substantial conservation of aminoacid sequence in said several plant species, one of said nucleotide regions being then selected for the insertion therein of the nucleic acid insert as described hereunder.
An alternative would consist of studying any 3-D structures which may become available in the future. 

2) inserting a nucleic acid insert in the selected region of said precursor nucleic acid in appropriate reading frame relationship with the non-modified parts of said relevant sequence, which insert includes a determined segment encoding a peptide containing all or part of the above mentioned appropriate aminoacids.

3) inserting the modified precursor-coding nucleic acid obtained in a plasmid suitable for the transformation of plant cells which can be regenerated into full seed-forming plants, wherein said insertion is brought under the control of regulation elements, particularly a plant promoter capable of providing for the expression of the open reading-frames associated therewith in said plants;

4) transforming a culture of such plant cells with such modified plasmid ; 

5) assaying the expression of the chimeric gene encoding the hybrid storage protein and, when expression is achieved,

6) regenerating said plants from the transformed plant cells obtained and growing said plants up to maturity.

Where the chimeric gene is under the control of a seed-specific promoter, growing the transformed plants to seed must precede step 5).

Hence the embodiment of the invention described under 1) hereabove provides that, when the hybrid 2S albumin is produced in a plant, it will pass the plant protein disulfide isomerase during membrane translocation, thus increasing the chances that the correct disulfide bridges are formed in the hybrid precursor as in its normal precursor situation.

The invention further relates to the recombinant nucleic acids themselves for use in the process of the invention, particularly to the:

- recombinant precursor encoding nucleic acid defined in the context of said process;
- recombinant nucleic acids containing said modified precursor encoding nucleic acid under the control of a plant promoter, whether the latter originates from the same DNA as that of said precursor coding nucleic acid or from another DNA of the same plant from which the precursor encoding nucleic acid is derived, or from a DNA of another plant, or from a non-plant organism provided that it is capable of directing gene expression in plants;

- vectors, more particularly plant plasmids, e.g. Ti-derived plasmids, modified by any of the preceding recombinant nucleic acids for use in the transformation of the above plant cells.

The invention also relates to the regenerable source of the hybrid 2S albumin, which is formed of the cells of a seed-forming plant, which plant cells are capable of being regenerated into the full plant or seeds of said seed-forming plants, wherein said plants or seeds have been obtained as a result of one or several generations of the plants resulting from the regeneration of said plant cells, wherein further the DNA supporting the genetic information of said plant cells or seeds comprises a nucleic acid or part thereof, including the sequences encoding the signal peptide, which can be transcribed into the mRNA corresponding to the precursor of a 2S albumin of said plant, placed under the control of a plant-specific promoter, and

. wherein said nucleic acid sequence contains a relevant modified sequence encoding the mature 2S storage protein or one of the several sub-sequences encoding the corresponding one or several sub-units of said mature 2S albumins,

. wherein further the modification of said relevant sequence takes place in one of its nonessential regions and consists of a heterologous nucleic acid insert forming an open reading frame in reading phase with the non modified parts which surround said insert in the relevant sequence,

. wherein said insert consists of a nucleotide segment encoding a peptide containing methionine and/or lysine and/or tryptophan and/or threonine and/or phenylalanine and/or leucine and/or valine and/or isoleucine.

It is to be considered that, although the invention should not be deemed as being limited thereto, the nucleic
acid inserts encoding the above mentioned appropriate aminoacids will in most instances be man-made synthetic
oligonucleotides or oligonucleotides derived from procaryotic or eucaryotic genes or from cDNAs derived
from procaryotic or eucaryotic RNAs, all of which shall normally escape any possibility of being inserted at the
appropriate places of the plant cells or seeds of this invention through biological processes, whatever the
nature thereof. In other words, these inserts are "non plant variety specific", especially in that they can be
inserted in different kinds of plants which are genetically totally unrelated and thus incapable of exchanging
any genetic material by standard biological processes, including natural hybridization processes.

Thus the invention further relates to the seed forming plants themselves which have been obtained from 
said transformed plant cells or seeds, which plants are characterized in that they carry said hybrid 
precursor-coding nucleic acids associated with a plant promoter in their cells, said inserts however being 
expressed and the corresponding hybrid protein produced in the cells of said plants. 

There follows an outline of a preferred method which can be used for the modification of a 2S albumin gene
and its expression in the seeds obtained from the transgenic plants. The outline of the method given here is
followed by a specific example. It will be understood by the person skilled in the art that the method can be
suitably adapted for the modification of other 2S albumin genes.

1. Replacement or supplementation of the hypervariable region of the 2S albumin gene by a sequence
encoding a peptide containing appropriate aminoacids which possess nutritional properties.

Either the cDNA or the genomic clone of the 2S albumin can be used. Comparison of the sequences of the
hypervariable regions of the genes in figure 2 shows that they vary in length. Therefore if the sequence
encoding a peptide containing the appropriate aminoacids is short and a 2S albumin with a relatively short
hypervariable region is used, said sequence of interest can be inserted. Otherwise part of the hypervariable
region is removed, to be replaced by the insert containing a larger segment or sequence encoding the peptide
containing the appropriate aminoacids. In either case the modified hybrid 2S albumin may be longer than the
native one. In either case two standard techniques can be applied: convenient restriction sites can be
exploited, or mutagenesis vectors (e.g. Stanssens et al., 1987) can be used. In both cases, care must be taken
to maintain the reading frame of the message.
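
The reading-frame requirement stated above can be checked on paper or in a few lines of code before any cloning is attempted. The following Python sketch is given for illustration only; the function names `preserves_frame` and `has_in_frame_stop` and the example segment are assumptions and do not form part of the disclosed method. It tests whether the net change in length caused by an insertion or a replacement is a multiple of three nucleotides, and whether the inserted segment carries a stop codon in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def preserves_frame(removed: str, inserted: str) -> bool:
    """True if swapping `removed` for `inserted` keeps downstream codons in frame."""
    return (len(inserted) - len(removed)) % 3 == 0


def has_in_frame_stop(segment: str, offset: int = 0) -> bool:
    """Scan `segment` codon by codon, starting `offset` bases in, for a stop codon."""
    seg = segment.upper()
    return any(seg[i:i + 3] in STOP_CODONS for i in range(offset, len(seg) - 2, 3))


# Hypothetical case: a pure insertion (nothing removed) of seven codons.
insert = "ATAATGATGATGATGCGCATG"
print(preserves_frame("", insert))   # True: 21 nucleotides = 7 codons
print(has_in_frame_stop(insert))     # False: no TAA, TAG or TGA in frame
```

A replacement is handled by passing the removed segment as the first argument, so that only the difference in length is tested.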

The sequence encoding the signal peptide of the precursor of the storage protein used either belongs to 
this precursor or can be a substitute sequence coding for the signal peptide or peptides of an heterologous 
storage protein. 

2. The altered 2S albumin coding region is placed under the control of a plant promoter. Preferred
promoters include the strong constitutive exogenous plant promoters such as the promoter from
cauliflower mosaic virus directing the 35S transcript (Odell, J.T. et al., 1985), also called the 35S promoter;
the 35S promoter from the CaMV isolate Cabb-JI (Hull and Howell, 1987), also called the 35S3 promoter;
and the bidirectional TR promoter which drives the expression of both the 1' and the 2' genes of the T-DNA
(Velten et al., 1984). Alternatively a promoter can be utilized which is not constitutive but specific for one
or more tissues or organs of the plant. By way of example, such promoters may be the light
inducible promoter of the ribulose-1,5-bisphosphate carboxylase small subunit gene (US patent
application 821,582), if the expression is desired in tissue with photosynthetic activity, or may be seed
specific promoters.

A seed specific promoter is used in order to ensure subsequent expression in the seeds only. This may be
of particular use, since seeds constitute an important food or feed source. Moreover, this specific expression
avoids possible stresses on other parts of the plant. In principle the promoter of the modified 2S albumin can
be used, but this is not necessary: any other promoter serving the same purpose can be used. The promoter
may be chosen according to its level of efficiency in the plant species to be transformed. In the examples
below the 2S albumin promoter from the 2S albumin gene from Arabidopsis is used, which constitutes the
natural promoter of the 2S albumin gene which is modified in said examples. Needless to say, other seed
specific promoters may be used, such as the conglycinin promoter from soybean. If a chimeric gene is so
constructed, a signal peptide encoding region must also be included, either from the modified gene or from
the gene whose promoter is being used. The actual construction of the chimeric gene is done using standard
molecular biological techniques described in Maniatis et al., 1982 (see example).

3. The chimeric gene construction is transferred into the appropriate host plant. 

When the chimeric or modified gene construction is complete it is transferred in its entirety to a plant
transformation vector. A wide variety of these, based on disarmed (non-oncogenic) Ti-plasmids derived from
Agrobacterium tumefaciens, are available, both of the binary and cointegration forms (Deblaere et al., 1987). A
vector including a selectable marker for transformation, usually antibiotic resistance, should be chosen.
Similarly, the methods of plant transformation are also numerous, and are fitted to the individual plant. Most are
based on either protoplast transformation (Marton et al., 1979) or transformation of a small piece of tissue from the
adult plant (Horsch et al., 1985). In the example below, the vector is a binary disarmed Ti-plasmid vector, the
marker is kanamycin resistance, and the leaf disc method of transformation is used.
Calli from the transformation procedure are selected on the basis of the selectable marker and regenerated
to adult plants by appropriate hormone induction. This again varies with the plant species being used.
Regenerated plants are then used to set up a stable line from which seeds can be harvested.

Further characteristics of the invention will appear in the course of the non-limiting disclosure of specific 
examples, particularly on the basis of the drawings in which : 

- Figs. 1, 2 and 3 refer to overall features of 2S-albumins as already discussed above. The numbers refer
to the number of aminoacids observed in the different fragments of the protein precursor.

- Fig. 4 represents the sequence of a 1 kb fragment containing the Arabidopsis thaliana 2S albumin gene
and shows related elements. The NdeI site is underlined.

- Fig. 5 provides the protein sequence of the large subunit of the above Arabidopsis 2S protein together 
with related oligonucleotide sequences. 

- Fig. 6A shows diagrammatically the successive phases of the construction of a chimeric 2S albumin
Arabidopsis thaliana gene, including the deletion of practically all parts of the hypervariable region and its
replacement by an AccI site, the insertion of DNA sequences rich in methionine codons, given by way of
example in the following disclosure, in the AccI site, particularly through site-directed mutagenesis, and
the cloning of said chimeric gene in a plant vector suitable for plant transformation.

- Fig. 6B shows diagrammatically the protein sequence of the large subunit of several Arabidopsis 2S
albumins and indicates the region removed from the genes encoding said 2S albumins, and shows
diagrammatically where an AccI site has been created and how oligonucleotides rich in methionine
codons are inserted into said AccI site in such a way that the open reading frame is maintained.

- Fig. 7 diagrammatically compares the protein sequences of the large subunits of the unmodified 2S
albumin, in which most of the hypervariable region has been deleted, and those of the modified 2S
albumins. The resulting numbers of methionine residues are indicated.

- Fig. 8 shows the restriction sites and genetic map of a plasmid suitable for the performance of the 
above site-directed mutagenesis. 

- Fig. 9 shows diagrammatically the different steps of the site-directed mutagenesis procedure of
Stanssens et al. (1987) as generally applicable to the modification of nucleic acid at appropriate places.

- Fig. 10 gives the restriction map of pGSC1703A. 

Example I : 

As a first example of the method described, a procedure is given for the production of transgenic plant
seeds with increased nutritional value by having inserted into their genome the gene for a modified 2S albumin
protein from Arabidopsis thaliana whose hypervariable region has been deleted and replaced, by way of
example, by a methionine rich peptide having 7 aminoacids with the following sequence: IMMMMRM. A synthetic
oligomer encoding said peptide is substituted for essentially the entire hypervariable region in a genomic clone
encoding the 2S albumin of Arabidopsis thaliana. Only a few aminoacids adjacent to the sixth and seventh
cysteine residues remain. This chimeric gene is under the control of its natural promoter and signal peptide.
The process and constructions are diagrammatically illustrated in Figs. 6A, 6B and 7. The entire construct is
transferred to tobacco, Arabidopsis thaliana and Brassica napus plants using an Agrobacterium mediated
transformation system. Brassica napus is of particular interest, since this crop is widely used as a protein source
for animal feed. Plants are regenerated, and after flowering the seeds are collected and the methionine content
is compared with that of untransformed plants.

1. Cloning of the Arabidopsis thaliana 2S albumin gene. 

The Arabidopsis thaliana gene has been cloned according to what is described in Krebbers et al., 1988. The
plasmid containing said gene is called pAT2S1. The sequence of the region containing the gene, which is
called AT2S1, is shown in figure 4.

2. Deletion of the hypervariable region of the AT2S1 gene and replacement by an AccI site.

Part of the hypervariable region of AT2S1 is replaced by the following oligonucleotide (30-mer):

5' - CCA ACC TTG AAA GGT ATA CAC TTG CCC AAC - 3'
      P   T   L   K   G   I   H   L   P   N

in which the underlined sequence (GTATAC) represents the AccI site and the surrounding sequences are
complementary to the coding sequence of the hypervariable region of the Arabidopsis 2S albumin gene to be
retained. This finally results in the aminoacid sequence indicated under the oligonucleotide.
The deletion and substitution of part of the sequence encoding the hypervariable region of AT2S1 is done 
using site directed mutagenesis with the oligonucleotide as primer. The system of Stanssens et al. (1987) is 
used. 
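
The relation between the 30-mer and the aminoacid sequence shown under it can be verified by a short computation. The Python sketch below is purely illustrative and assumes the standard genetic code; the helper name `translate` is hypothetical. It confirms the length of the oligonucleotide, locates the sequence GTATAC (one of the sequences recognized by AccI) and translates the oligonucleotide in the frame of the 2S albumin message.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring spaces and any trailing partial codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


OLIGO = "CCA ACC TTG AAA GGT ATA CAC TTG CCC AAC"
ACCI_SITE = "GTATAC"

seq = OLIGO.replace(" ", "")
print(len(seq))              # 30
print(seq.find(ACCI_SITE))   # 13 (0-based position of the site)
print(translate(seq))        # PTLKGIHLPN
```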

The Stanssens et al. method is described in EP 87 402 384.4. It makes use of plasmid pMac5-8 whose
restriction and genetic map and the positions of the relevant genetic loci are shown in Fig. 8. The arrows
denote their functional orientation. fdT: central transcription terminator of phage fd; F1-ORI: origin of
replication of filamentous phage f1; ORI: ColE1-type origin of replication; BLA/ApR: region coding for
β-lactamase; CAT/CmR: region coding for chloramphenicol acetyl transferase. The positions of the amber
mutations present in pMc5-8 (the bla-am gene does not contain the ScaI site) and
pMa5-8 (cat-am; the mutation eliminates the unique PvuII site) are indicated. Suppression of the cat amber
mutation in both supE and supF hosts results in resistance to at least 25 µg/ml Cm. pMc5-8 confers resistance
to ±20 µg/ml and 100 µg/ml Ap upon amber-suppression in supE and supF strains respectively. The EcoRI,
BalI and NcoI sites present in the wild-type cat gene (indicated with an asterisk) have been removed using
mutagenesis techniques.

Essentially, the mutagenesis round used for the above mentioned substitution is run as follows. Reference is
made to Fig. 9, in which the amber mutations in the Ap and Cm selectable markers are shown by closed
circles. The symbol * represents the mutagenic oligonucleotide. The mutation itself is indicated by an
arrowhead.

The individual steps of the process are as follows:

- Cloning of the HindIII fragment of pAT2S1 containing the coding region of the AT2S1 gene into pMa5-8 (I).
This vector carries an amber mutation in the CmR gene and specifies resistance to ampicillin. The resulting
plasmid is designated pMacAT2S1 (see figure 6A step 1).

- Preparation of single stranded DNA of this recombinant (II) from pseudoviral particles.

- Preparation of a HindIII restriction fragment from the complementary pMc type plasmid (III). pMc-type vectors
contain the wild type CmR gene while an amber mutation is incorporated in the Ap resistance marker.

- Construction of gap duplex DNA (hereinafter called gdDNA) (IV) by in vitro DNA/DNA hybridization.
In the gdDNA the target sequences are exposed as single stranded DNA. Preparative purification of the
gdDNA from the other components of the hybridization mixture is not necessary.

- Annealing of the 30-mer synthetic oligonucleotide to the gdDNA (V).

- Filling in the remaining single stranded gaps and sealing of the nicks by a simultaneous in vitro Klenow DNA
polymerase I / DNA ligase reaction (VI).

- Transformation of a mutS host, i.e., a strain deficient in mismatch repair, selecting for Cm resistance. This
results in production of a mixed plasmid progeny (VII).

- Elimination of progeny deriving from the template strand (pMa-type) by retransformation of a host unable to
suppress amber mutations (VIII). Selection for Cm resistance results in enrichment of the progeny derived
from the gapped strand, i.e., the strand into which the mutagenic oligonucleotide has been incorporated.

- Screening of the clones resulting from the retransformation for the presence of the desired mutation. The
resulting plasmid containing the deleted hypervariable region of AT2S1 is called pMacAT2S1 C40 (see figure 6A
step 2).
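
The marker logic underlying the enrichment step (VIII) can be summarized in a small model. The Python sketch below is a simplification offered for illustration only: the class `Plasmid`, the function `is_resistant` and the allele labels are assumptions, and resistance is treated as a simple yes or no rather than as a level of resistance. A marker carrying an amber mutation is taken to confer resistance only in a host able to suppress amber mutations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    ap_marker: str  # "wt" or "amber"
    cm_marker: str  # "wt" or "amber"


def is_resistant(plasmid: Plasmid, antibiotic: str, host_suppresses_amber: bool) -> bool:
    """A wild-type marker always confers resistance; an amber marker only in a suppressor host."""
    if antibiotic not in ("Ap", "Cm"):
        raise ValueError("antibiotic must be 'Ap' or 'Cm'")
    allele = plasmid.ap_marker if antibiotic == "Ap" else plasmid.cm_marker
    return allele == "wt" or host_suppresses_amber


# Marker configuration as described in the text for the two vector types.
PMA = Plasmid("pMa-type", ap_marker="wt", cm_marker="amber")
PMC = Plasmid("pMc-type", ap_marker="amber", cm_marker="wt")

# Step VIII: mixed progeny retransformed into a non-suppressing host, selected on Cm.
progeny = [PMA, PMC]
survivors = [p.name for p in progeny if is_resistant(p, "Cm", host_suppresses_amber=False)]
print(survivors)  # ['pMc-type']
```

Under these assumptions only progeny derived from the gapped (pMc-type) strand, which carries the mutagenic oligonucleotide, survive the selection.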

3. Insertion of sequences rich in methionine codons into the AT2S1 gene whose sequences encoding the 
hypervariable region have been deleted. 

As stated above, when the sequences encoding most of the hypervariable loop were removed an AccI site
was inserted in its place. The sequences of interest will be inserted into this AccI site, but a second AccI site is
also present in the HindIII fragment containing the modified gene. Therefore the NdeI-HindIII fragment
containing the modified gene is subcloned into the cloning vector pBR322 (Bolivar, 1977) also cut with NdeI
and HindIII. The position of the NdeI site in the 2S albumin gene is indicated in figure 4. The resulting subclone
is designated pBRAT2S1 (Figure 6A, step 3).

In principle any insert desired can be inserted into the AccI site in pBRAT2S1. In the present example said
insert encodes the following sequence: I.M.M.M.M.R.M. Therefore complementary oligonucleotides encoding
said peptide are synthesized taking into account the codon usage of AT2S1 and ensuring that the ends of the
two complementary oligonucleotides are complementary to the staggered ends of the AccI site, as shown
here (the oligonucleotides are shown in bold type):



5' GT ATA ATG ATG ATG ATG CGC ATG ATAC 3'
3' CA TAT TAC TAC TAC TAC GCG TAC TATG 5'

The details of this insertion, showing how the reading frame is maintained, are shown in figure 6B. The two
oligonucleotides are annealed and ligated with pBRAT2S1 digested with AccI (figure 6A, step 4). The resulting
plasmid is designated pAD4.
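
The effect of this insertion on the reading frame can be followed with a short simulation. The Python sketch below is illustrative only and rests on stated assumptions: the 30-mer of step 2 stands in for the surrounding coding sequence, AccI is taken to cut the upper strand of GTATAC after GT, and the upper-strand oligonucleotide is taken to span the 21 nucleotides lying between the two halves of the cut site. The names `insert_at_acci`, `TARGET` and `INSERT_TOP` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate codon by codon from the first base, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_at_acci(target: str, insert_top: str) -> str:
    """Insert `insert_top` at the single GTATAC site of `target`, after the GT of the site."""
    site = "GTATAC"
    if target.count(site) != 1:
        raise ValueError("expected exactly one AccI site (GTATAC) in the target")
    cut = target.index(site) + 2
    return target[:cut] + insert_top + target[cut:]


TARGET = "CCAACCTTGAAAGGTATACACTTGCCCAAC"   # 30-mer region carrying the AccI site
INSERT_TOP = "ATAATGATGATGATGCGCATG"         # assumed upper-strand insert, 21 nucleotides

modified = insert_at_acci(TARGET, INSERT_TOP)
print(translate(TARGET))     # PTLKGIHLPN
print(translate(modified))   # PTLKGIMMMMRMIHLPN
```

The check for a single site mirrors the reason given above for subcloning the NdeI-HindIII fragment: a second AccI site in the target would make the position of insertion ambiguous.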

4. Reconstruction of the complete modified AT2S1 gene with its natural promoter. 

The complete chimeric gene is reconstructed as follows (see figure 6A): The clone pAT2S1Bg contains a
3.6 kb BglII fragment inserted in the cloning vector pJB65 (Botterman et al., 1987) which encompasses not only
the 1.0 kb HindIII fragment containing the coding region of the gene AT2S1 but sufficient sequences upstream
and downstream of this fragment to contain all necessary regulatory elements for the proper expression of the
gene. This plasmid is cut with HindIII and the 5.2 kb fragment (i.e., that portion of the plasmid not containing the
coding region of AT2S1) is isolated. The clone pAT2S1 is cut with HindIII and NdeI and the resulting 320 bp
HindIII-NdeI fragment is isolated. This fragment represents the one removed from the modified 2S albumin in
the construction of pBRAT2S1 (step 3 of figure 6A) in order to allow the insertion of the oligonucleotides in
step 4 of figure 6A to proceed without the complications of an extra AccI site. These two isolated fragments
are then ligated in a three way ligation with the NdeI-HindIII fragment from pAD4 (figure 6A, step 5) containing
the modified coding sequence. Individual transformants can be screened to check for appropriate orientation of
the reconstructed HindIII fragment within the BglII fragment using any of a number of sites. The resulting
plasmid, pAD17, consists of a 2S albumin gene modified only in the hypervariable region, surrounded by the
same flanking sequences and thus the same promoter as the unmodified gene, the entirety contained on a
BglII fragment.

5. Transformation of plants. 

The BglII fragment containing the chimeric gene is inserted into the BglII site of the binary vector
pGSC1703A (Fig. 10) (see also Fig. 6A step 6). The resultant plasmid is designated pTAD12. Vector
pGSC1703A contains functions for selection and stability in both E. coli and A. tumefaciens, as well as a T-DNA
fragment for the transfer of foreign DNA into plant genomes (Deblaere et al., 1987). It further contains the
bi-directional TR promoter (Velten et al., 1984) with the neomycin phosphotransferase protein coding region
(neo) and the 3' end of the ocs gene on one side, and a hygromycin transferase gene on the other side, so that
transformed plants are both kanamycin and hygromycin resistant. This plasmid does not carry an ampicillin
resistance gene, so that carbenicillin as well as claforan can be used to kill Agrobacterium after the infection
step. Using standard procedures (Deblaere et al., 1987), pTAD12 is transferred to the Agrobacterium strain
C58C1Rif carrying the plasmid pMP90 (Koncz and Schell, 1986). The latter provides in trans the vir gene
functions required for successful transfer of the T-DNA region to the plant genome. This Agrobacterium is
then used to transform plants. Tobacco plants of the strain SR1 are transformed using standard procedures
(Deblaere et al., 1987). Calli are selected on 100 µg/ml kanamycin, and resistant calli are used to regenerate plants.

The techniques for transformation of Arabidopsis thaliana and Brassica napus are such that exactly the
same construction, in the same vector, can be used. After mobilization to Agrobacterium tumefaciens as
described hereabove, the procedures of Lloyd et al. (1986) and Klimaszewska et al. (1985) are used for
transformation of Arabidopsis and Brassica respectively. In each case, as for tobacco, calli can be selected on
100 µg/ml kanamycin, and resistant calli used to regenerate plants.

In the case of all three species, at an early stage of regeneration the regenerants are checked for
transformation by inducing callus from leaf on media supplemented with kanamycin (see also point 6).

6. Screening and analysis of transformed plants. 

In the case of all three species, regenerated plants are grown to seed. Since different transformed plants
can be expected to have varying levels of expression ("position effects", Jones et al., 1985), more than one
transformant must initially be analyzed. This can in principle be done at either the RNA or protein level; in this
case seed RNA was prepared as described (Beachy et al., 1985) and northern blots carried out using standard
techniques (Thomas et al., 1980). Since in the case of both Brassica and Arabidopsis the use of the entire
chimeric gene would result in cross hybridization with endogenous genes, oligonucleotide probes
complementary to the insertion within the 2S albumin were used; the same probe as used to make the
construction can be used. For each species, 1 or 2 individual plants were chosen for further analysis as
discussed below.

First the copy number of the chimeric gene is determined by preparing DNA from leaf tissue of the
transformed plants (Dellaporta et al., 1983) and probing with the oligonucleotide used above.
The methionine content of the seeds is analyzed using known methods (Joseph and Marsden, 1986; Gehrke
et al., 1985; Elkin and Griffith, 1985 (a) and (b)).

Example II 

As a second example of the method described, the same procedure is followed for the production of
transgenic plant seeds with increased nutritional value by having inserted into their genome the gene for a
modified 2S albumin protein from Arabidopsis thaliana whose hypervariable region has been deleted and
replaced, by way of example, by a methionine rich peptide having 24 aminoacids with the following sequence:
I M M M Q P R G D M M M I M M M Q P R G D M M M

All the different steps going from constructs to transformants as disclosed for example I are executed, with the
only difference that in step 3 the following oligonucleotide has been synthesized and inserted into pBRAT2S1
(the oligonucleotides are shown in bold type):



5' GT ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA ATG ATG ATG
3' CA TAT TAC TAC TAC GTT GGT TCC CCG CTA TAC TAC TAC TAT TAC TAC TAC

   CAA CCA AGG GGC GAT ATG ATG ATG ATA C 3'
   GTT GGT TCC CCG CTA TAC TAC TAC TAT G 5'



The relevant plasmids are indicated in figure 6A, details of the insertion in figure 6B and the resulting aminoacid
sequence of the hybrid subunit in figure 7. The relevant plasmids as indicated in figure 6A are pAD3,
pAD7 and pTAD10.
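
The two strands of such a duplex and the peptide it encodes can be cross-checked computationally. The Python sketch below is illustrative only and assumes the standard genetic code; the names `complement`, `translate` and `TOP` are hypothetical. The upper strand is written with the flanking bases of the AccI site, so that translation starts at the third base, the first two bases (GT) completing the preceding glycine codon.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Upper strand of the second example, 5' to 3', including the flanking site bases.
TOP = (
    "GT ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA "
    "ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA C"
).replace(" ", "")


def complement(dna: str) -> str:
    """Return the base-by-base complement (the lower strand read 3' to 5')."""
    return dna.translate(COMPLEMENT)


def translate(dna: str, offset: int = 0) -> str:
    """Translate codon by codon starting at `offset`, ignoring a trailing partial codon."""
    end = len(dna) - (len(dna) - offset) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, end, 3))


bottom_3_to_5 = complement(TOP)
peptide = translate(TOP, offset=2)

print(bottom_3_to_5[:14])   # CATATTACTACTAC
print(peptide)              # IMMMQPRGDMMMIMMMQPRGDMMMI
print(peptide.count("M"))   # 12
```

The final isoleucine of the translated stretch corresponds to the ATA codon regenerated at the junction with the site, so that 24 aminoacids are added relative to the unmodified sequence.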

The examples have thus given a complete illustration of how 2S albumin storage proteins can be modified to
incorporate therein an insert encoding a methionine rich polypeptide, followed by the transformation of plant
cells such as tobacco cells, Arabidopsis cells and Brassica napus cells with an appropriate plasmid containing
the corresponding modified precursor nucleic acid, the regeneration of the transformed plant cells into
corresponding plants, the culture thereof up to the seed forming stage, the recovery of the seeds and finally
the analysis of the methionine content of said seeds compared with the seeds of corresponding non
transformed plants.

It goes without saying that the invention is not limited to the above examples. The person skilled in the art
will in each case be able to choose the desired combination of appropriate aminoacids to be inserted into the
hypervariable region of the 2S storage protein, according to the plant he wants to improve with regard to its
nutritional value and according to the desired application of the modified plant.

There follows a list of bibliographic references which have been referred to in the course of the present 
disclosure to the extent when reference has been made to known methods for achieving some of the process 
steps referred to herein or to general knowledge which has been established prior to the performance of this 
invention. All of the said articles are incorporated herein by reference; 

It is further confirmed

- that plasmid pAT2S1 has been deposited with the DSM under No. 4879 on October 7, 1988,

- that plasmids pMa5-8 and pMc5-8 have been deposited with the DSM under Nos. 4567 and 4566 respectively on May 3, 1988,

- that plasmid pAT2S1Bg has been deposited with the DSM under No. 4878 on October 7, 1988,

- that plasmid pGSC1703A has been deposited with the DSM under No. 4880 on October 7, 1988,
notwithstanding the fact that they all consist of constructs that the person skilled in the art can reproduce
from available genetic material without performing any inventive work.

2S Albumin As % Of Total Seed Protein



TABLE 1

Family, species (common name)              %
----------------------------------------------
Compositae
  Helianthus annuus (sunflower)            62
Cruciferae
  Brassica spp. (mustard)                  62
Linaceae
  Linum usitatissimum (linseed)            42
Leguminosae
  Lupinus polyphyllus (lupin)              38
  Arachis hypogaea (peanut)                20
Lecythidaceae
  Bertholletia excelsa (brazil nut)        30
Liliaceae
  Yucca spp. (yucca)                       27
Euphorbiaceae
  Ricinus communis (castor bean)           44

From Youle and Huang, 1981

Claims

1. A process for producing transgenic plants with increased nutritional value which comprises:

- cultivating plants obtained from regenerated plant cells or from seeds of plants obtained from said
regenerated plant cells over one or several generations, wherein the genetic patrimony or information of
said plant cells, replicable within said plants, includes a nucleic acid sequence encoding a modified 2S
albumin storage protein derived from a natural 2S albumin storage protein and placed under the control of
a promoter capable of directing gene expression in said plants;

. wherein said nucleic acid encodes at least part of the precursor of said 2S albumin including its signal
peptide or a signal peptide of another 2S albumin, said nucleic acid being hereafter referred to as the
"precursor encoding nucleic acid",

. wherein said nucleic acid contains a nucleotide sequence (hereafter termed the "relevant sequence"),
which relevant sequence comprises a nonessential region of said 2S albumin modified by a heterologous
nucleic acid insert or substitution for part of said nonessential region, said insert or substitution forming
an open reading frame in reading phase with the non-modified parts surrounding said insert in said
relevant sequence,

. wherein said insert or substitution encodes a polypeptide formed of aminoacids, identical or different
from one another, containing at least one appropriate aminoacid selected from lysine,
methionine, tryptophane, threonine, phenylalanine, leucine, valine, isoleucine and arginine in a proportion
sufficient that said modified 2S albumin storage protein is enriched in at least one of said appropriate
aminoacids with respect to the content of the same appropriate aminoacid in the natural storage protein.

2. The process of claim 1 wherein said modified 2S albumin storage protein is derived from a natural 2S
storage protein which is itself foreign to the transgenic plant.

3. The process of claim 2 wherein said transgenic plant is a plant which has been transformed with a
recombinant vector, e.g., a Ti-plasmid derived vector which contained said transformed storage protein.

4. The process of any of claims 1 to 3 wherein the modified storage protein is derived from a group of
storage proteins obtained from the following plants:
Ricinus communis
Arabidopsis thaliana
Brassica napus
Bertholletia excelsa

5. The process of any of claims 1 to 4 wherein said modified storage protein contains a number of at least
one of said appropriate aminoacids, e.g., lysine, methionine or tryptophane, greater by at least 2, preferably
by 4, than the number of the same appropriate aminoacid in the non modified storage protein.

6. The process of any of claims 1 to 5 wherein said insert is located in the region which extends in the
non modified 2S albumin between the codons which code for the sixth and the seventh cysteine residues
respectively, whereby the group formed of 3 codons, preferably 6 codons respectively, next to each of
those which code for said sixth and seventh cysteine residues and within said region code for the same
aminoacids as corresponding groups of 3, preferably 6 codons in the non modified storage proteins.

7. The process of any of claims 1 to 6, wherein said promoter is the natural promoter of said nucleic
acid.

8. The process of any of claims 1 to 6, wherein said promoter is heterologous with respect to said
nucleic acid.

9. The process of any of claims 1 to 6 wherein said promoter is a constitutive promoter.

10. The process of any of claims 1 to 6 wherein said promoter is a tissue specific promoter.

11. The promoter of claim 10 which is a seed specific promoter.

12. The process of any of claims 1 to 11, wherein the heterologous insert is foreign to the natural nucleic
acid encoding the precursor of said 2S albumin.

13. The process of any one of claims 1 to 12, wherein the heterologous insert contains a segment as
above-defined normally present in the genetic patrimony or information of said seeds or plant cells, the
"heterologous" character of said insert then referring to the one or several codons which surround it
on both sides thereof and which link said segment to the non modified parts of the nucleic acid encoding
said precursor.

14. A recombinant DNA which includes a nucleic acid sequence which can be transcribed into the
mRNA encoding at least part of the precursor of a 2S albumin including the signal peptide of said plant,
said nucleic acid being hereafter referred to as the "precursor encoding nucleic acid":

. wherein said nucleic acid contains a nucleotide sequence (hereafter termed the "relevant sequence"),
which relevant sequence comprises a nonessential region modified by a heterologous nucleic acid insert
forming an open reading frame in reading phase with the non modified parts surrounding said insert in
said relevant sequence,

. wherein said insert includes a nucleotide segment encoding a polypeptide containing said appropriate
aminoacids as defined in any of claims 1 to 12,

. wherein said precursor coding nucleic acid is placed under the control of a promoter capable of directing
gene expression in plants.

15. The recombinant DNA of claim 14 which is a plasmid. 

16. The recombinant DNA of claim 15 which is capable of transforming plant cells and of causing the 
replication of said modified precursor nucleic acid sequence in said plant cells. 

17. The recombinant DNA of claim 16 which is a Ti-derived plasmid. 

18. As a regenerable source of appropriate aminoacids with high nutritional value, which is formed of 
either plant celts of a seed-forming plant, which plant cells are capable of being regenerated into the 
fullplant or seeds of said seed-forming plants wherein said plants or seeds have been obtained as a result 
of one or several generations of the plants resulting from the regeneration of said plant cells, wherein 
further the DNA supporting the genetic information of said plant cells or seeds comprises a nucleic acid 
or part thereof, including the sequences encoding the signal peptide, which can be transcribed in the 
mRNA corresponding to the precursor of a 2S albumin in said plant, placed under the control of a 
promotor capable of directing gene expression in plants, and 

. wherein said nucleic acid sequence contains a relevant modified sequence encoding the mature 2S 
albumin or one of the several sub-sequences encoding for the corresponding one or several sub units of 
said mature storage protein, 

. wherein further the modification of said relevant sequence takes place in one of its non essential regions 
and consists of a heterologous nucleic acid insert forming an open-reading frame in reading phase with 
non modified parts which surround said insert in the relevant sequence, 
. wherein said insert or substitution is as defined in any of claims 1 to 12. 

19. The source of polypeptide of claim 18, wherein said insert is a synthetic man-made oligonucleotide. 

20. The source of polypeptide of claim 18, wherein said insert is obtained from a prokaryotic or 
eukaryotic organism. 

21. The source of polypeptide of claim 18, 19 or 14, wherein the heterologous segment contained in said 
insert encodes a non plant variety specific polypeptide. 
FIGURE 3 

FIGURE A 



ATCTTTATCCA -421 

TATATTGTCTTACCATCAATAGACAATATCCAATGGACCGGTGACCTGCGTGTATAAGTA -361 

ATTTTTCAAGATGCTAAAACTTTTATGTATTTCAGAATTAACCTCCAAAAACATTTATTG -301 

ACACACTACTACTCTTTCCGTATTGACTCTCAACTAGTCATTTCAAAATAATTGACATGT -241 

CAGAACATGAGTTACACATGGTTGCATATTGCAAGTAGACGCGGAAACTTGTCACTTCCT -181 

TTACATTTGAGTTTCCAACACCTAATCACGACAACAATCATATAGCTCTCGCATACAAAC -121 

AAACATATGCATGTATTGTTACACGTGAACTCCATGCAAGTCTCTTTTCTCACCTATAAA -61 

TACCAACCACACCTTCACCACATTCTTCACTCGAACCAAAACATACACACATAGCAAAAA -1 

MANKLFLVCAALALCFLLTN 20 

ATGGCAAACAAGTTGTTCCTCGTCTGCGCAGCTCTCGCTCTCTGCTTCCTCCTCACCAAC 60 

*Start SSU 

A*SIYRTVVEFEEDDATN*PIG 40 

GCTTCCATCTACCGCACCGTCGTTGAGTTCGAAGAAGATGACGCCACTAACCCCATAGGC 120 

PKMRKCRKEFQKEQHLRACQ* 60 

CCAAAAATGAGGAAATGCCGCAAGGAGTTTCAGAAAGAACAACACCTAAGAGCTTGCCAG 180 

♦Processed --> 

QLMLQQARQGRSD*EFDFEDD 80 

CAATTGATGCTCCAGCAAGCAAGGCAAGGCCGTAGCGATGAGTTTGATTTCGAAGACGAC 240 
* Large subunit --> 

MEN*PQGQQQEQQLFQQCCNE 100 

ATGGAGAACCCACAGGGACAACAGCAGGAACAACAGCTATTCCAGCAGTGCTGCAACGAG 300 

LRQEEPDCVCPTLKQAAKAV 120 

CTTCGCCAGGAAGAGCCAGATTGTGTTTGCCCCACCTTGAAACAAGCTGCCAAGGCCGTT 360 

oligonucleotide 5'-CAAGCTGCCAAGTACGGT- 

K Y G 

RLQGQHQPMQVRKIYQTAK*H* 140 

AGACTCCAGGGACAGCACCAACCAATGCAAGTCAGGAAAATTTACCAGACAGCCAAGCAC 420 
GGATTCTTGAAGCAGCACCAAC-3' oligo 

G F L K 

End mat. lg. su.* 

LPNVCDIPQVDVCPFNI*PSF 160 

TTGCCCAACGTTTGCGACATCCCGCAAGTTGATGTTTGTCCCTTCAACATCCCTTCATTC 480 

P S F Y * 164 

CCTTCTTTCTACTAAATCTCAAACAAACCCTCAAAGCGTATGAGAGTGTGGTTGTTGATA 540 



TATACATGTTGACACTTGACACATACCACACCTCATCGTGTGTTTTATGATAAATGT 597 
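
The reading frame shown in the figure can be checked by translating the coding region (positions 1 to 495) with the standard genetic code. The Python sketch below is illustrative only. The names `translate`, `CODING` and `OLIGO` are hypothetical, the sequence is transcribed from the figure, and position 305 is taken to be G, which is an assumption where the print is unclear. The sketch also translates the oligonucleotide next to the native stretch it overlaps, so that the residues it changes can be compared directly.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in frame 1 up to (not including) the first stop codon."""
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Coding region as transcribed from the figure, stop codon included.
# Assumption: position 305 is G.
CODING = (
    "ATGGCAAACAAGTTGTTCCTCGTCTGCGCAGCTCTCGCTCTCTGCTTCCTCCTCACCAAC"
    "GCTTCCATCTACCGCACCGTCGTTGAGTTCGAAGAAGATGACGCCACTAACCCCATAGGC"
    "CCAAAAATGAGGAAATGCCGCAAGGAGTTTCAGAAAGAACAACACCTAAGAGCTTGCCAG"
    "CAATTGATGCTCCAGCAAGCAAGGCAAGGCCGTAGCGATGAGTTTGATTTCGAAGACGAC"
    "ATGGAGAACCCACAGGGACAACAGCAGGAACAACAGCTATTCCAGCAGTGCTGCAACGAG"
    "CTTCGCCAGGAAGAGCCAGATTGTGTTTGCCCCACCTTGAAACAAGCTGCCAAGGCCGTT"
    "AGACTCCAGGGACAGCACCAACCAATGCAAGTCAGGAAAATTTACCAGACAGCCAAGCAC"
    "TTGCCCAACGTTTGCGACATCCCGCAAGTTGATGTTTGTCCCTTCAACATCCCTTCATTC"
    "CCTTCTTTCTACTAA"
)

# Oligonucleotide as printed in the figure (both halves joined).
OLIGO = "CAAGCTGCCAAGTACGGTGGATTCTTGAAGCAGCACCAAC"

precursor = translate(CODING)

# Locate the native stretch through the 12 unchanged bases at the 5' end of the oligo.
start = CODING.find(OLIGO[:12])
native_window = CODING[start:start + len(OLIGO)]

# Only the first 39 bases form complete codons in this frame.
native_peptide = translate(native_window[:39])
mutant_peptide = translate(OLIGO[:39])

print(len(precursor), precursor)
print(native_peptide, "->", mutant_peptide)
```

The comparison of `native_peptide` with `mutant_peptide` shows which residues the oligonucleotide leaves untouched at its two ends and which it replaces in the middle.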
FIGURE 5 
Figure 6 A Flow chart of constructions showing successive steps in the deletion of sequences 
encoding most of the hypervariable region of the Arabidopsis 
2S and their replacement with sequences rich in methionine codons. 
[Flow chart labels: Step 2; Step 3; * pad 7; + pad 17; BglII; Step 6 pgscl703a; * ptad 10; + ptad 12] 
Figure 6 B Comparison of the amino acid sequences of the large subunit of several 2S 
albumins and details of the deletion and substitutions of the hypervariable 
region of the Arabidopsis 2S albumin 
Amino acid sequence between the 5th and 7th cysteines resulting from the deletion of the 
HV region of AT2S1 and insertion of an AccI site 

CVCPTLKGI HLPNVCDI 

Details of the insertions into the AccI site made in examples 1 and 2. 
The bases originating from the AccI site are shown in bold print. 

GT'ATAC 
CATA'TG 

AccI site 



AAA GGT ATA ATG ATG ATG ATG CGC ATG ATA CAC 
TTT CCA TAT TAC TAC TAC TAC GCG TAC TAT GTG 



Example I 



AAA GGT ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA ATG ATG ATG 
TTT CCA TAT TAC TAC TAC GTT GGT TCC CCG CTA TAC TAC TAC TAT TAC TAC TAC 

CAA CCA AGG GGC GAT ATG ATG ATG ATA CAC 
GTT GGT TCC CCG CTA TAC TAC TAC TAT GTG 

Example II 
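
Because the two strands of each insert are drawn one above the other, every base in the lower strand should be the Watson-Crick partner of the base directly above it. The Python sketch below illustrates that check. The names are hypothetical, and the first two codons of the Example II upper strand are read as AAA GGT by analogy with Example I, which is an assumption. The lower strand of Example II is computed from the upper strand rather than copied from the figure.

```python
# Map each base to its Watson-Crick partner; spaces between codons pass through unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return strand.translate(COMPLEMENT)


# Upper and lower strands of the AccI site and of Example I, as drawn in the figure.
ACCI_TOP = "GTATAC"
ACCI_BOTTOM = "CATATG"
EXAMPLE_1_TOP = "AAA GGT ATA ATG ATG ATG ATG CGC ATG ATA CAC"
EXAMPLE_1_BOTTOM = "TTT CCA TAT TAC TAC TAC TAC GCG TAC TAT GTG"

# Upper strand of Example II (two printed rows joined).
# Assumption: the first two codons are AAA GGT, as in Example I.
EXAMPLE_2_TOP = (
    "AAA GGT ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA ATG ATG ATG "
    "CAA CCA AGG GGC GAT ATG ATG ATG ATA CAC"
)

acci_ok = complement(ACCI_TOP) == ACCI_BOTTOM
example_1_ok = complement(EXAMPLE_1_TOP) == EXAMPLE_1_BOTTOM
example_2_bottom = complement(EXAMPLE_2_TOP)

print("AccI site strands pair:", acci_ok)
print("Example I strands pair:", example_1_ok)
print("Example II lower strand:", example_2_bottom)
```

A mismatch between a printed lower strand and the computed one points to a transcription error in one of the two rows.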
Figure 7 Amino acid sequences of the large subunits of the Arabidopsis 2S protein, of that 
remaining after the deletion of most of the hypervariable region, and of the two modified 
Arabidopsis 2S proteins. 
AT2S1   -PQVDVCPFN--IPSFPS 

Del HV  -PQVDVCPFN--IPSFPS 

Subst 1 -PQVDVCPFN--IPSFPS 

Subst 2 -PQVDVCPFN--IPSFPS 



AT2S1 = unmodified large subunit of AT2S1 
Del HV = deletion of the AT2S1 HV region 
Subst 1 = substitution by IMMMMRM 

Subst 2 = substitution by IMMMQPRGDMMMIMMMQPRGDMMM 



Number of methionines in the large subunit 

AT2S1    1 
Del HV   0 
Subst 1  5 
Subst 2  12 
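
The methionine counts for the two substituted subunits follow from the replacement peptides, given that the subunit left after deletion of the HV region contains none. The short Python sketch below illustrates the tally. The names are hypothetical, and the Subst 2 peptide is read as IMMMQPRGDMMMIMMMQPRGDMMM, which is the translation of the Example II insert and is an assumption where the legend is unclear.

```python
# Replacement peptides from the legend.
# Assumption: Subst 2 is read as the translation of the Example II insert.
SUBSTITUTIONS = {
    "Subst 1": "IMMMMRM",
    "Subst 2": "IMMMQPRGDMMMIMMMQPRGDMMM",
}

# Taken from the table: the large subunit left after the HV deletion has no methionine.
DEL_HV_METHIONINES = 0


def methionine_count(peptide: str) -> int:
    """Count methionine residues (one-letter code M) in a peptide."""
    return peptide.count("M")


totals = {
    name: DEL_HV_METHIONINES + methionine_count(peptide)
    for name, peptide in SUBSTITUTIONS.items()
}

for name, total in totals.items():
    print(name, total)
```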